One of the biggest hurdles faced by researchers studying algorithms for massive graphs is
the lack of large input graphs, which are essential for developing and testing graph
algorithms. This paper proposes two efficient and highly scalable parallel graph generation
algorithms that can produce massive realistic graphs to address this issue. The algorithms,
designed to achieve a high degree of parallelism by minimizing inter-processor communication,
are two of the fastest graph generators capable of generating scale-free graphs with
billions of vertices and edges. The synthetic graphs generated by the proposed methods possess
the most common properties of real complex networks, such as power-law degree distribution,
small-worldness, and communities-within-communities. Scalability was tested on a large cluster
at Lawrence Livermore National Laboratory. In the experiment, we were able to generate a
graph with 1 billion vertices and 5 billion edges in less than 13 seconds. To the best of our
knowledge, this is the largest synthetic scale-free graph reported in the literature.

1 Introduction

Recent studies have revealed that many real-world graphs belong to a special class of graphs
called complex networks (or complex graphs). Examples of real-world complex graphs include the
World-Wide Web @], the Internet [HJ H3 [25] ED], electric power grids [31], citation networks
[T6| [26] [25] [29], telephone call graphs PQ, and e-mail networks [9]. These graphs typically
carry a wealth of valuable information for their respective domains. Therefore, a great deal of
research effort has been concentrated on developing algorithms that identify and mine knowledge
or data of interest from these graphs. For example, algorithms that find groups of vertices with
strong associations between them (called communities) have been reported [BJ [22], [23] [21].
There also exist algorithms which, given a template pattern, find subgraphs that closely match
the input pattern [TTj. Such algorithms can play a very important role in detecting certain
criminal activities or in making critical business decisions.

Real-world complex graphs are typically very large (with millions of vertices or more), and their
sizes grow over time. Some researchers predict that the size of these graphs will eventually reach
10 vertices [15]. The high complexity of graph algorithms, combined with the large and
increasing size of the target graphs, however, makes these applications very difficult to apply
to large real graphs. Efforts are being made to parallelize these applications [32] [2] and to
develop efficient out-of-core graph algorithms [27] to cope with these technical challenges.

One of the key issues in developing these graph applications is the availability of large input
graphs, as these graphs are essential for developers to build and test the applications and to
measure their scalability and performance. Unfortunately, there are no publicly available real
graphs that are large enough to test the functionality and true scalability of the graph
applications. A social network graph derived from the World Wide Web, for example, contains
15 million vertices [TT], and the largest citation network available has two million vertices [21].
Although these real-world graphs tend to grow in size, it is unlikely that real graphs of
sufficiently large size will be available in the near future.

The lack of large graphs has forced researchers to use synthetically generated random
graphs, which are relatively cheap to construct, in their experiments [32]. Experiments on such
random graphs (also known as Erdős-Rényi random graphs [10]), however, are uninformative, since
the structure of random graphs differs greatly from that of real-world graphs. In the absence of
large real graphs, synthetic graphs that resemble real-world graphs may be used instead for the
development of graph applications. There exist several good models for synthetically generating
complex networks [31 \TE[ I3~T]. A serious drawback of these models is that they are all
sequential and hence inadequate for generating massive graphs with billions of vertices and edges.

In this paper, we propose two efficient and highly scalable graph generation methods. Based
on serial models [3] Q2], these methods are designed to generate massive scale-free graphs in
parallel on distributed parallel computers. These parallel generators require very little
inter-processor communication and thus achieve a high degree of parallelism. The first method,
called the parallel Barabási-Albert (PBA) method, iteratively builds a graph using a technique
called two-phase preferential attachment. The second method, called the parallel Kronecker (PK)
method, applies the concept of the Kronecker product of matrices [14] and constructs a graph
recursively in a fractal fashion from a given seed graph. These are two of the fastest graph
generation algorithms capable of generating scale-free graphs with billions of vertices and edges.
We have demonstrated their scalability by constructing massive graphs on a large cluster at
Lawrence Livermore National Laboratory. In the experiment, we generated a scale-free graph with
1 billion vertices and 5 billion edges in less than 13 seconds; to the best of our knowledge, this
is the largest synthetic scale-free graph ever reported in the literature. We have also analyzed
the properties of the graphs generated by the proposed methods and report the results in this
paper. We have found that these graphs possess commonly known properties of real-world complex
networks, including power-law degree distribution, small-worldness, and
communities-within-communities.

The remainder of the paper is organized as follows. Section 2 surveys the related work in the
literature. The proposed parallel models are described in Section 3. Section 4 presents the results
of the performance and characterization study, followed by concluding remarks and directions for
future work in Section 5.

2 Related Work 

Erdős and Rényi proposed a simple model that generates equilibrium random graphs, called
Erdős-Rényi random graphs [10]. In this model, given a fixed number of vertices, a graph is
constructed by repeatedly connecting randomly chosen vertices with an edge until the predetermined
number of edges is obtained. This model is restrictive in that it produces only Poisson degree
distributions.
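
The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `erdos_renyi_edges` and its parameters are hypothetical, and the sketch additionally rejects self-loops and duplicate edges, which the description above does not specify.

```python
import random

def erdos_renyi_edges(num_vertices, num_edges, seed=None):
    """Return num_edges distinct undirected edges between random vertex pairs."""
    max_edges = num_vertices * (num_vertices - 1) // 2
    if num_edges > max_edges:
        raise ValueError("too many edges requested for a simple graph")
    rng = random.Random(seed)
    edges = set()
    # Keep connecting randomly chosen vertex pairs until enough edges exist.
    while len(edges) < num_edges:
        u = rng.randrange(num_vertices)
        v = rng.randrange(num_vertices)
        if u != v:  # assumption: no self-loops
            edges.add((min(u, v), max(u, v)))  # the set discards duplicates
    return sorted(edges)

sample = erdos_renyi_edges(100, 300, seed=1)
```

Because every vertex pair is equally likely, no vertex is favored, which is why the degrees concentrate around the mean instead of forming a heavy tail.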

Dorogovtsev et al. proposed a model that can generate graphs with fat-tailed degree
distributions [7]. Given a random graph, this model restructures the graph at each step of the
evolution by rewiring a randomly chosen end of a randomly chosen edge to a preferentially chosen
vertex and by moving a randomly chosen edge to a position between two preferentially chosen vertices.

The model proposed by Watts and Strogatz [31] generates random structures with small diameter,
known as small-world graphs. This model transforms a regular one-dimensional
lattice (with vertex degree of four or higher) by rewiring each edge, with a certain probability,
to a randomly chosen vertex. It has been found that, even with a small rewiring probability, the
average shortest-path length of the resulting graphs is of the order of that of random graphs.
The degree distributions of these graphs, however, are not fat-tailed.

The majority of recent models use a method called preferential attachment [6]. In a
representative model among these, proposed by Barabási and Albert [3], a new vertex joins the graph
at each time step and is connected to an existing vertex with probability proportional to that
vertex's degree. With preferential attachment, these models can emulate the dynamic growth of
real graphs.

Leskovec et al. [18] proposed a graph generation model that addresses some recently
discovered properties of time-evolving graphs: densification and shrinking diameter. The main
idea of their model is to recursively create self-similar graphs with a certain degree of
randomness. The self-similarity of the graphs is achieved by using the Kronecker product (also
known as the tensor product) [14], which is a natural tool for constructing self-similar
structures. Given a seed graph, at each step this model computes the Kronecker product of two
matrices that represent the seed graph and the graph generated in the previous step, respectively.
The graphs generated with this method have a regular structure. To add randomness to the graph,
the model changes the entries in the target matrix with a certain probability before each
multiplication.

3 Proposed Graph Generation Methods

3.1 Parallel Barabási-Albert (PBA) method

Scale-free graphs can be easily generated using a well-known technique called preferential
attachment [6]. In a simple serial model known as the Barabási-Albert (BA) model [3], a scale-free
graph is constructed, starting with a small clique, by repeatedly creating a vertex and attaching
it to one of the existing vertices with probability proportional to that vertex's current degree.
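
A minimal serial sketch of the BA model described above is given here for illustration. It is not the authors' code: the name `barabasi_albert_edges`, the choice of a clique on `edges_per_vertex + 1` vertices as the starting graph, and the rule that a new vertex attaches to distinct targets are all assumptions made for this example.

```python
import random

def barabasi_albert_edges(num_vertices, edges_per_vertex, seed=None):
    """Serial BA sketch: grow a graph from a small clique by preferential attachment."""
    k = edges_per_vertex
    if k < 1 or num_vertices < k + 1:
        raise ValueError("need k >= 1 and at least k + 1 vertices")
    rng = random.Random(seed)
    # Assumed starting graph: a clique on vertices 0..k.
    edges = [(u, v) for u in range(k + 1) for v in range(u)]
    # Flat endpoint list: a vertex appears once per incident edge, so a uniform
    # draw from this list picks a vertex with probability proportional to degree.
    endpoints = [x for e in edges for x in e]
    for new_vertex in range(k + 1, num_vertices):
        targets = set()
        while len(targets) < k:  # assumption: k distinct targets per new vertex
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new_vertex, t))
            endpoints.extend((new_vertex, t))
    return edges

graph = barabasi_albert_edges(1000, 2, seed=7)
```

Each attachment costs a single uniform draw, so the total work grows linearly with the number of edges.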

We have parallelized the BA model in this research and propose a graph generation algorithm
called the parallel BA (PBA) method. In this method, vertices are distributed among the processors,
and all the edges adjacent to a given vertex are stored on the processor to which the vertex is
assigned.

Sets of processors called factions are used in the PBA method. Each processor belongs to one
or more factions, and the number of processors in each faction varies. Such variation is essential
for the correct implementation of the preferential attachment operation in a distributed
environment. Furthermore, we can assign the processors to factions in a manner that enables us to
generate graphs with certain structures. The size of each faction and the number of factions are
both degrees of freedom in this method. To facilitate the implementation, we assign all
vertices on a single processor to the same set of factions. In other words, if two vertices reside
on the same processor, then they are members of the same set of factions.

It is crucial to use an efficient implementation of preferential attachment to allow this method
to scale. This can be done most efficiently by selecting an existing edge from the graph with
uniform probability and then randomly selecting one of its endpoints as the vertex to which a new
vertex is attached. Because a vertex of degree d is an endpoint of d edges, it is chosen with
probability proportional to d, and an edge can therefore be added in constant time.
A slight variation of this algorithm is used in the PBA method. The proposed PBA algorithm is
described below in detail. It is assumed that the algorithm runs on a processor p; other processors
perform the same algorithm. We also assume that p is a member of factions
$F_0, F_1, \ldots, F_{n-1}$.

In the PBA method, an edge is attached in two phases. In the first phase of our preferential 
attachment, k edges are added per newly created local vertex (a vertex that resides on p) as in the 
conventional BA model. However, each edge, e, associates a local vertex with some processor q, 
instead of connecting two vertices as in the serial model. The particular vertex that is to be the 
eventual endpoint of e is determined remotely by the processor q. 

The processor q is selected using a variation of the preferential attachment algorithm as follows.
Let A denote a local edge list maintained by the processor p. First, we initialize A by associating
the first s edges with the processors in factions $F_0, F_1, \ldots, F_{n-1}$, matching
sequentially one edge to one processor in the set of factions. Here, s is the total number of
processors in factions $F_0, F_1, \ldots, F_{n-1}$ (i.e., $s = \sum_{i=0}^{n-1} |F_i|$). For an
edge $e_j$, where $j > s$, we select an existing edge from A with uniform probability (thus
realizing preferential attachment) and then assign its associated processor to $e_j$. This process
is repeated until the predetermined number of local vertices and edges are created on p. At the
end of the first phase, p sends a message to each processor q to notify it of the number of
occurrences of q in A.
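
The first phase described above can be sketched as follows for a single processor. This is a simplified, single-process illustration and not the authors' implementation: message passing is omitted, and the names `pba_phase1`, `factions`, and `requests` are hypothetical. The returned counter holds the numbers that p would send to each processor q at the end of the phase.

```python
import random
from collections import Counter

def pba_phase1(factions, num_local_vertices, edges_per_vertex, seed=None):
    """Build the local edge list A of (local_vertex, processor) pairs.

    factions: list of lists of processor ids, one list per faction of p.
    """
    seed_procs = [q for faction in factions for q in faction]
    if not seed_procs:
        raise ValueError("the processor must belong to a non-empty faction")
    rng = random.Random(seed)
    A = []
    for j in range(num_local_vertices * edges_per_vertex):
        local_vertex = j // edges_per_vertex
        if j < len(seed_procs):
            # The first s edges are matched one-to-one with faction members.
            q = seed_procs[j]
        else:
            # Copy the processor of a uniformly chosen existing edge, so a
            # processor is picked in proportion to its occurrences in A.
            q = rng.choice(A)[1]
        A.append((local_vertex, q))
    requests = Counter(q for _, q in A)  # occurrences of each q in A
    return A, requests

# Setting of the example in Figure 1: P0 is in F0 = {P1, P2} and F2 = {P0, P1}.
A, requests = pba_phase1([[1, 2], [0, 1]], num_local_vertices=5, edges_per_vertex=2, seed=3)
```

Only the per-processor counts need to be communicated, which is why this phase requires very little inter-processor traffic.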

In the second phase, p determines the endpoints for the edges on remote processors and connects
the endpoints calculated by remote processors to its local vertices. The processor p first receives
messages from other processors, which contain the numbers of occurrences of p in their respective
local edge lists. That is, the message received from a processor q gives the number of incomplete
edges that have one endpoint on q and still need an endpoint on p. These edges are to be connected
to local vertices on p, which are selected using the standard preferential attachment technique.
Once the list of vertices for the attachment is determined, it is divided among the requesting
processors so that each processor is assigned as many vertices as it requested. The selected
vertices are then sent to the corresponding processors.

Having sent the endpoints for the remote edges, p then receives the lists of endpoints from other
processors for its own incomplete edges. Using the remote vertices received, p completes its local
partition of the graph. This is done by simply substituting each occurrence of processor q in A
with the next endpoint in the list sent by q. The resulting collection of edges defines the portion
of the graph stored on p.
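
The substitution step at the end of the second phase can be illustrated with a short sketch. The function name `pba_complete_edges` and the dictionary `endpoint_lists`, which stands in for the messages received from the other processors, are hypothetical, and the small input below is made up for illustration.

```python
def pba_complete_edges(A, endpoint_lists):
    """Replace each processor placeholder in A with the next endpoint it sent.

    A: list of (local_vertex, processor) pairs produced in the first phase.
    endpoint_lists: dict mapping a processor id to the vertex ids it returned.
    """
    cursors = {q: iter(vertices) for q, vertices in endpoint_lists.items()}
    # Each occurrence of q consumes the next vertex from the list sent by q.
    return [(u, next(cursors[q])) for u, q in A]

# Hypothetical input: four incomplete edges and the endpoints received for them.
incomplete = [(0, 1), (0, 2), (1, 0), (1, 1)]
received = {1: [8, 7], 2: [10], 0: [3]}
print(pba_complete_edges(incomplete, received))
# prints [(0, 8), (0, 10), (1, 3), (1, 7)]
```

The sketch assumes that every processor returned exactly as many endpoints as were requested from it, as the method prescribes.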

The two-phase preferential attachment is illustrated with an example in Figure 1. In this
example, we generate a graph with 5 vertices per processor and 2 edges per vertex. It is assumed
that there are three factions, $F_0 = \{P_1, P_2\}$, $F_1 = \{P_1, P_3\}$, and
$F_2 = \{P_0, P_1\}$, and that processor $P_0$ belongs to factions $F_0$ and $F_2$. The vertices
are assumed to be evenly distributed among the processors so that vertices 0-4 are on $P_0$,
vertices 5-9 on $P_1$, and so on.

In the first phase of the algorithm, $P_0$ selects processors and associates them with the local
vertices as shown in Figure 1a, where the edge list on $P_0$ is depicted. Note that the first four
processors in the list are the ones in the factions that $P_0$ belongs to, $F_0$ and $F_2$. The
rest of the processors in the list are selected using the standard preferential attachment
technique. At the end of phase 1, $P_0$ needs four endpoints from $P_1$ (and three endpoints from
each of $P_0$ and $P_2$). These endpoints are determined by processor $P_1$ via preferential
attachment and sent to $P_0$ in the second phase. In this example, we assume that vertices 8, 7, 5,
and 8 are sent to $P_0$. Upon receiving the list, $P_0$ simply replaces the entries marked with
$P_1$ with the endpoints in the list. This is shown in Figure 1b.
(a) Edge list on $P_0$ at the end of phase 1
(b) A snapshot of the edge list on $P_0$ during phase 2

Figure 1: An example of PBA graph construction on processor $P_0$. A list of edges (u, v) is
maintained on $P_0$, where u denotes a local vertex and v is an endpoint determined by remote
processors in phase 2. In this example, three factions are used, where $F_0 = \{P_1, P_2\}$,
$F_1 = \{P_1, P_3\}$, and $F_2 = \{P_0, P_1\}$. Processor $P_0$ belongs to factions $F_0$ and $F_2$.



We have found that it is useful to modify the algorithm slightly to incorporate some inter-faction
edges. In particular, during the first phase, we occasionally select a processor that is not in any
of the factions of p. Such processors are chosen randomly. The probability of creating an
inter-faction edge is another degree of freedom in this algorithm.



3.2 Parallel Kronecker (PK) method

A model that uses the well-known concept of Kronecker matrix multiplication to generate scale-free
graphs has been recently proposed [18]. If A is an m × n matrix and B is a p × q matrix, then the
Kronecker product $A \otimes B$ is the mp × nq block matrix defined as

$$
A \otimes B =
\begin{pmatrix}
a_{11}B & \cdots & a_{1n}B \\
\vdots & \ddots & \vdots \\
a_{m1}B & \cdots & a_{mn}B
\end{pmatrix}.
$$

The Kronecker product of two graphs is defined as the Kronecker product of their adjacency
matrices. Figure 2 shows an example where a Kronecker graph is generated from a given seed graph
using Kronecker graph multiplication. Since the Kronecker product is an ideal tool for constructing
self-similar structures, the graph generated by this method also has a self-similar structure, as
shown in Figure 2d.
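
The block definition above translates directly into code. The following sketch uses plain Python lists; the function name `kronecker_product` and the 2 × 2 seed matrix are hypothetical choices for illustration and are not the seed graph of Figure 2.

```python
def kronecker_product(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    # Entry (i, j) lies in block (i // p, j // q) and equals a * b, where b is
    # the entry of B at the position (i % p, j % q) inside that block.
    return [[A[i // p][j // q] * B[i % p][j % q] for j in range(n * q)]
            for i in range(m * p)]

seed = [[1, 1],
        [1, 0]]
for row in kronecker_product(seed, seed):
    print(row)
# prints
# [1, 1, 1, 1]
# [1, 0, 1, 0]
# [1, 1, 0, 0]
# [1, 0, 0, 0]
```

Every 1 of the left operand is replaced by a copy of the right operand, so the pattern of the seed reappears at both scales of the result.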

Implementing a serial algorithm for the above graph generation method is straightforward.
Starting with an $n_0 \times n_0$ adjacency matrix with $e_0$ edges representing a given seed
graph, we recursively construct larger adjacency matrices using Kronecker matrix multiplication in
a top-down manner. To generate the ith matrix from the (i − 1)th matrix, we simply replace every 1
in the (i − 1)th matrix by an $n_0 \times n_0$ block that is a copy of the seed graph, and every 0
by an $n_0 \times n_0$ block of zeros. So if the ith graph has $n_i$ vertices, the (i + 1)th graph
has $n_i \times n_0$ vertices. Thus, $n_i = n_0^{i+1}$ for all iterations. We treat each edge in
the graph at each iteration as a meta-edge. A meta-edge is defined by its iteration and its
position in the graph for that particular iteration. Given the size of a target graph, we can
calculate the number of iterations required to generate the target graph.
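
As a small worked example of this calculation, the sketch below finds the smallest iteration i for which $n_0^{i+1}$ reaches a target number of vertices, counting the seed graph as iteration 0. The function name `kronecker_iterations` is hypothetical, and integer arithmetic is used to avoid rounding problems with logarithms.

```python
def kronecker_iterations(n0, target_vertices):
    """Smallest i such that n0 ** (i + 1) >= target_vertices (seed graph is i = 0)."""
    if n0 < 2:
        raise ValueError("the seed graph needs at least two vertices")
    i, n = 0, n0
    while n < target_vertices:
        n *= n0  # each iteration multiplies the vertex count by n0
        i += 1
    return i

# A 20-vertex seed reaches 160,000 = 20 ** 4 vertices at iteration 3.
print(kronecker_iterations(20, 160_000))  # prints 3
```

Since the vertex count can only take the values $n_0, n_0^2, n_0^3, \ldots$, a target size that is not a power of $n_0$ can only be approximated from above.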
(a) Seed graph $G_1$ with 5 vertices
(b) Adjacency matrix for $G_1$
(c) Adjacency matrix for $G_2 = G_1 \otimes G_1$
(d) Plot for $G_4$

Figure 2: An example of the Kronecker multiplication for graph generation. $G_2$ is constructed by
multiplying the seed graph with itself (that is, $G_2 = G_1 \otimes G_1$). The self-similarity in
$G_4$ is clearly shown in (d).

A Kronecker graph can be generated efficiently by using a stack, initialized with the edges in
the seed graph. A graph is generated by expanding meta-edges in the stack as follows. First, the
meta-edge on the top of the stack is popped. If its iteration, i, is equal to the predetermined
final iteration, then the edge is added to the final graph. Otherwise, new meta-edges with
iteration i + 1 are generated and pushed onto the stack. This operation is repeated until the
stack is empty. We
choose a stack because it guarantees that the memory requirement is limited to 0( e ^/\E\), where 
$|E|$ is the number of edges in the final graph. An implementation using a queue is not scalable,
as it would require $O(|E|)$ memory space.
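
The stack-based expansion described above can be sketched as a Python generator. The function name `kronecker_edges`, the representation of a meta-edge as an `(iteration, row, column)` tuple, and the convention that the seed graph is iteration 0 are assumptions made for this illustration.

```python
def kronecker_edges(seed_edges, n0, final_iteration):
    """Yield the edges of a Kronecker graph by expanding meta-edges from a stack.

    seed_edges: (row, col) positions of the 1s in the n0 x n0 seed matrix.
    final_iteration: iteration at which meta-edges become edges of the result.
    """
    stack = [(0, r, c) for r, c in seed_edges]
    while stack:
        i, r, c = stack.pop()
        if i == final_iteration:
            yield (r, c)
        else:
            # A 1 at (r, c) becomes a copy of the seed placed at (r * n0, c * n0).
            for sr, sc in seed_edges:
                stack.append((i + 1, r * n0 + sr, c * n0 + sc))

seed_edges = [(0, 0), (0, 1), (1, 0)]  # hypothetical 2 x 2 seed
print(sorted(kronecker_edges(seed_edges, 2, 1)))
# prints [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 2), (2, 0), (2, 1), (3, 0)]
```

In this sketch the stack holds at most about one seed's worth of meta-edges per iteration level at any time, because the most recently pushed meta-edges are always expanded first, whereas a queue would have to hold an entire level before moving to the next one.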

In the parallel implementation of the Kronecker method, the meta-edges are divided among
groups of processors at each iteration. Every processor in a processor group generates the same
meta-edges at a given iteration. If there are more processors in a processor group than there are
meta-edges in that group's portion of the graph, then each processor in the group is assigned to a
single meta-edge in the stack. Here, each meta-edge defines a new processor group that is a subset
of the original, and the processor group ignores all other meta-edges at that iteration. On the
other hand, if there are more meta-edges than processors in the processor group, the meta-edges are
divided as evenly as possible among those processors. Each of those processors is then in a
singleton processor group for the remaining iterations. Each processor must be able to calculate on
the fly which meta-edges are in its processor group at a given iteration.
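
One way a processor could compute its share on the fly is sketched below. The round-robin rule used here is an assumption chosen for simplicity; the text above only requires that the division be as even as possible. The function name `assign_meta_edges` and its return convention are likewise hypothetical.

```python
def assign_meta_edges(rank, group_size, num_edges):
    """Return (meta-edge indices kept, rank in new group, size of new group).

    rank: position of this processor within its current processor group.
    """
    if group_size > num_edges:
        # More processors than meta-edges: join the subgroup of a single edge.
        edge = rank % num_edges
        new_rank = rank // num_edges
        new_size = group_size // num_edges + (1 if edge < group_size % num_edges else 0)
        return [edge], new_rank, new_size
    # Otherwise split the meta-edges round-robin; the processor then works
    # alone (a singleton group) for the remaining iterations.
    return list(range(rank, num_edges, group_size)), 0, 1

# Seven processors and three meta-edges: subgroups of sizes 3, 2, and 2.
print([assign_meta_edges(r, 7, 3) for r in range(7)])
```

Because the rule depends only on the rank, the group size, and the number of meta-edges, every processor can evaluate it locally without communicating with the others.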

In general, it is difficult to achieve good load balance with the PK method, as some processors
may be assigned more work than others depending on processor group sizes. A dynamic load balancing
scheme may be used in conjunction with the PK method to overcome this limitation. Furthermore, some
randomization is needed to make the structure of the PK graphs less regular. One approach is to add
or delete meta-edges during the replacement phase at each iteration by randomly modifying the seed
graph temporarily. Another approach is to perform an exclusive-OR operation between the final
adjacency matrix and the adjacency matrix of a random graph.

Method   |V| (Million)   |E| (Billion)   Time (Seconds)
PBA      1,000           5               12.39
PK       0.53            5.4             2.53

Table 1: Comparison of graph generation time by PBA and PK methods. The numbers of vertices
and edges in the generated graphs are denoted by |V| and |E|, respectively.

4 Experimental Results 

4.1 Experiment environment and metrics of interest 

We have conducted a study to evaluate the proposed graph generators. The experiments were
conducted on MCR [20], a large Linux cluster located at Lawrence Livermore National Laboratory.
MCR has 1,152 nodes interconnected with a Quadrics switch, and each of the compute nodes has
two 2.4 GHz Intel Pentium 4 Xeon processors and 4 GB of memory.

In this study, we are mainly interested in evaluating the performance of the proposed graph
generators and analyzing the graphs they generate. Real complex networks have well-known
structural and temporal properties [19]. We use widely accepted properties as indices to
quantitatively evaluate the synthetic graphs generated by the proposed methods.

4.2 Results 

Two graphs were generated using the PBA and PK methods on 1,000 processors of the MCR
cluster, and we report the graph generation times in Table 1. Each generation time is an average
over multiple runs, where the time of a run is the maximum time across all processes. Disk I/O
time is not included in the reported times.

Both graphs have about 5 billion edges.¹ The number of vertices in the PK graph is considerably
smaller than that in the PBA graph due to our use of a seed graph with a large average degree. As
shown in the table, it takes less than 13 seconds to generate these massive graphs. The high
generation rate can be attributed to the high degree of parallelism of the proposed algorithms.

The performance of both methods is further detailed in Figure 3, where we show the results of
a weak-scaling study. In a weak-scaling test, the global problem size increases with the total
number of processors such that the local problem size remains constant. A local problem
size of roughly one million vertices and three million edges is used. The figure reveals that the
PK method is about four times faster than the PBA method. In particular, the almost flat curve for
the PK method highlights the embarrassingly parallel nature of the algorithm. The graph generation
time for the PBA method, on the other hand, increases as the number of processors increases. This
is because in the PBA method each processor processes endpoint vertices sent by remote processors
at the end of the execution, and the complexity of this processing increases in proportion to the
total number of processors used. Profiling of the code confirms that each process spends most of
its time processing the received endpoints.

¹ We measure the size of a graph by the total space needed to store the graph in this paper. That
is, given a graph G = (V, E), its size is |V| + |E|. Therefore, we consider the PBA graph to be
larger than the PK graph in this experiment.

Figure 3: Weak-scaling results

In the remainder of this section, we analyze the graphs produced by the proposed methods. The
PBA graph studied in this experiment has 330,000 vertices and 2 million edges. The PK graph,
constructed using a small seed graph with 20 vertices and 40 edges, has 160,000 vertices and
28 million edges. We also consider two real-world graphs, a WWW graph and a router graph, for
comparison. The WWW graph has 325,000 vertices and 2.1 million edges, and the smaller router
graph has 285,000 vertices and 861,000 edges.

Figure 4 presents the degree distributions of the synthetic graphs and compares them with that
of the router graph. The graphs are shown on a log-log scale. As shown in the figure, the curves
for both the PBA and PK graphs are heavy-tailed. This is a signature of a power-law degree
distribution, which is one of the widely accepted properties of real-world complex networks.
Figure 4c shows that the router graph also has a fat-tailed degree distribution. To verify that
these graphs have power-law degree distributions, $P(k) \propto k^{-\gamma}$, we have performed
curve fitting on the measured degree distributions and show the exponent of the power-law
distribution ($\gamma$) in Figure 4. As shown in the figure, the $\gamma$ values for the three
graphs are greater than 2. This finding coincides with the fact that if the average degree of a
scale-free graph is finite, then its $\gamma$ value should satisfy $2 < \gamma < \infty$ [6].
The PK graph has a large number of high-degree vertices. This is because the number of low-degree
vertices is small in a graph generated by the PK method, in which the degree of a vertex grows
exponentially. We can change the degree distribution by randomly adding low-degree vertices to
the final graph.
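
The curve fitting mentioned above can be illustrated with a simple estimator. The text does not state which fitting procedure was used, so the least-squares fit of log P(k) against log k below is an assumption, and the function name `fit_power_law_exponent` is hypothetical.

```python
import math
from collections import Counter

def fit_power_law_exponent(degrees):
    """Least-squares slope of log P(k) versus log k; returns gamma = -slope."""
    counts = Counter(d for d in degrees if d > 0)  # P(k): vertices with degree k
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive degrees")
    xs = [math.log(k) for k in counts]
    ys = [math.log(c) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic degree sequence with P(k) = 4096 * k ** -2 for k = 1, 2, 4, 8.
degrees = [1] * 4096 + [2] * 1024 + [4] * 256 + [8] * 64
gamma = fit_power_law_exponent(degrees)  # close to 2.0 for this input
```

A plain least-squares fit is sensitive to the noisy tail of a measured distribution, so in practice the high-degree range is often binned or truncated before fitting.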

Figure 4: Degree distributions of the synthetic and router graphs. Here k and P(k) denote the
degree and the number of vertices with degree k, respectively. The exponent $\gamma$ of a power-law
distribution, $P(k) \propto k^{-\gamma}$, for each graph is obtained through curve fitting and
shown here as well.

Graph          Avg. Path Length   Diameter (estimated)
WWW Graph      7.54               46
Router Graph   8.87               27
PK Graph       3.20               5
PBA Graph      6.26               12

Table 2: Comparison of the path lengths and diameters of the synthetic graphs with those of two
real graphs. Both metrics are estimated by sampling to reduce the computation overhead.

(a) PBA graph
(b) PK graph

Figure 5: Communities within PBA and PK graphs. The graphs are represented as adjacency
matrices.

Table 2 presents the average path lengths and diameters of the two synthetic graphs considered
in the previous experiment as well as those of the WWW and router graphs. Both metrics are
estimates obtained through sampling to reduce the computation time. Each of the graphs analyzed has
a short average path length, which is the average length of the shortest path between two randomly
chosen vertices. Further, each synthetic graph has a small diameter, which is the maximum over
all-pairs shortest paths. These results indicate that the graphs generated by the proposed methods
have the small-world property, which is another key characteristic of real-world complex networks.
Such small-worldness is more evident in the PK graph, as it contains a large number of
high-degree vertices (or hubs). The two real graphs, the WWW graph in particular, appear to have
a smaller number of hubs, as indicated by their larger diameters.
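
Sampling-based estimates of the two metrics can be obtained with breadth-first searches from a few randomly chosen source vertices, as sketched below. The text does not describe the sampling scheme that was actually used, so this scheme, the adjacency-dictionary representation, and the names `bfs_distances` and `estimate_path_metrics` are assumptions.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def estimate_path_metrics(adj, num_samples, seed=None):
    """Estimate (average path length, diameter) from BFS runs at sampled sources."""
    rng = random.Random(seed)
    sources = rng.sample(sorted(adj), min(num_samples, len(adj)))
    total, pairs, longest = 0, 0, 0
    for s in sources:
        for v, d in bfs_distances(adj, s).items():
            if v != s:
                total += d
                pairs += 1
                longest = max(longest, d)
    return (total / pairs if pairs else 0.0), longest

# Path graph 0-1-2-3-4, sampling every vertex as a source.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(estimate_path_metrics(path, num_samples=5, seed=0))  # prints (2.0, 4)
```

With fewer sources than vertices, the longest distance found is only a lower bound on the true diameter, which is why such diameters are reported as estimates.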

In Figure 5, we show two adjacency matrices for the PBA and PK graphs to visualize the community
structures within these synthetic graphs. As shown in the figure, the PBA and PK graphs have
clearly identifiable community structures. A major difference between the two graphs is that the
PK graph has more regularly structured communities than the PBA graph. The regular
community structure of the PK graph is the result of the systematic way in which the PK method
constructs a graph (using Kronecker matrix multiplication). In addition, the self-similar nature of
the Kronecker product translates into the communities-within-communities structure in the PK
graph.

4.3 Comparison of the PBA and PK methods

An advantage of the PK method over the PBA method is its higher degree of parallelism. The PK
method is embarrassingly parallel: once a seed graph is given, processors generate their assigned
portions of the target graph independently of each other. A key limitation of the PK method is that
the structure of the resulting graph depends heavily on that of the initial seed graph. In fact,
even with randomized edge generation and removal, the structure of the final graph largely depends
on the seed graph and is thus relatively regular. Consequently, it is very difficult to
configure the PK method to generate a graph with a desired property. For example, if the seed graph
is too small, it is very difficult to control the degree of vertices. To control the vertex degree,
we need a relatively large seed graph, but with such a large seed graph, it is hard to control the
size of the final graph.

Although slower than the PK method, the PBA method is still a very fast algorithm. An obvious
advantage of the PBA method is that, because it uses preferential attachment as the key means of
constructing a graph, it can be easily configured to generate a graph of desired size and
properties.

5 Conclusions and Future Work 

Two efficient and scalable parallel graph generation methods that can generate scale-free graphs
with billions of vertices and edges are proposed in this paper. The proposed parallel
Barabási-Albert (PBA) method iteratively builds scale-free graphs using a two-phase preferential
attachment technique in a bottom-up fashion. The parallel Kronecker (PK) method, on the other hand,
constructs a graph recursively in a top-down fashion from a given seed graph using Kronecker matrix
multiplication. These parallel graph generators operate with a high degree of parallelism. We have
generated a graph with 1 billion vertices and 5 billion edges in less than 13 seconds on a large
cluster. To the best of our knowledge, this is the highest rate of graph generation reported in the
literature. We have analyzed the graphs produced by our methods and shown that they have the most
common properties of real complex networks, such as power-law degree distribution,
small-worldness, and communities-within-communities.

There are other known and somewhat debatable properties of complex networks. A rigorous
study of the graphs generated by the proposed methods will reveal whether these methods can
produce synthetic graphs with these properties. This study will also provide us with a better
understanding of how the logic used in our algorithms affects the properties of the synthetic
graphs they generate. Based on this study, we will develop a set of pre- and post-generation
processing and randomization techniques that will enable us to construct a synthetic graph with
desired properties.